Conditional Random Sampling: A Sketch-based Sampling Technique for Sparse Data

Authors

  • Ping Li
  • Kenneth Ward Church
  • Trevor J. Hastie
Abstract

We develop Conditional Random Sampling (CRS), a technique particularly suitable for sparse data. In large-scale applications, the data are often highly sparse. CRS combines sketching and sampling: it converts sketches of the data into conditional random samples online, in the estimation stage, with the sample size determined retrospectively. This paper focuses on approximating pairwise l2 and l1 distances and on comparing CRS with random projections. For boolean (0/1) data, CRS is provably better than random projections. We show using real-world data that CRS often outperforms random projections. This technique can be applied in learning, data mining, information retrieval, and database query optimization.
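The abstract describes the mechanism only in outline, so the following is a minimal sketch of how the sketch-then-sample idea could look, not the authors' implementation. It assumes that columns are randomly permuted once, that each row's sketch keeps its k nonzero entries with the smallest permuted IDs, and that the squared l2 distance is estimated by scaling the sampled sum by D / D_s. The function names, the parameter k, and the scaling estimator are illustrative assumptions that this abstract does not state.

```python
import random


def make_permutation(dim, seed=0):
    """Return perm where perm[col] is a permuted ID in 1..dim (assumed setup)."""
    ids = list(range(1, dim + 1))
    random.Random(seed).shuffle(ids)
    return ids


def build_sketch(row, perm, k, dim):
    """Sketch one sparse row given as {column: nonzero value}, with k >= 1.

    Keeps the k nonzero entries with the smallest permuted IDs.
    Returns (entries, covered): entries maps permuted ID -> value, and
    covered is the largest permuted ID up to which the row is fully known.
    """
    entries = sorted((perm[col], val) for col, val in row.items())
    if len(entries) <= k:
        # Every nonzero is stored, so the whole row is known.
        return dict(entries), dim
    kept = entries[:k]
    return dict(kept), kept[-1][0]


def estimate_sq_l2(sketch_a, sketch_b, dim):
    """Estimate the squared l2 distance between two rows from their sketches.

    The sample size d_s is fixed only now, at estimation time: both rows are
    fully known on permuted IDs 1..d_s, which behaves like a random sample of
    d_s columns. The sampled sum is scaled up by dim / d_s (an assumption).
    """
    (a, covered_a), (b, covered_b) = sketch_a, sketch_b
    d_s = min(covered_a, covered_b)
    total = 0.0
    for i in set(a) | set(b):
        if i <= d_s:  # entries beyond d_s are discarded
            total += (a.get(i, 0.0) - b.get(i, 0.0)) ** 2
    return dim / d_s * total
```

When k is at least the number of nonzeros in both rows, d_s equals the full dimension and the estimate coincides with the exact squared distance. With a smaller k, the estimate is a scaled sum over the shared prefix of permuted columns.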


Similar resources

One sketch for all: Theory and Application of Conditional Random Sampling

Abstract Conditional Random Sampling (CRS) was originally proposed for efficiently computing pairwise (l2, l1) distances, in static, large-scale, and sparse data. This study modifies the original CRS and extends CRS to handle dynamic or streaming data, which much better reflect the real-world situation than assuming static data. Compared with many other sketching algorithms for dimension reduct...


Accelerating Magnetic Resonance Imaging through Compressed Sensing Theory in the k-space Direction

Magnetic Resonance Imaging (MRI) is a noninvasive imaging method widely used in medical diagnosis. Data in MRI are obtained line-by-line within the K-space, where there are usually a great number of such lines. For this reason, magnetic resonance imaging is slow. MRI can be accelerated through several methods such as parallel imaging and compressed sensing, where a fraction of the K-space lines...


A Block-Wise random sampling approach: Compressed sensing problem

The focus of this paper is to consider the compressed sensing problem. It is stated that the compressed sensing theory, under certain conditions, helps relax the Nyquist sampling theory and takes smaller samples. One of the important tasks in this theory is to carefully design measurement matrix (sampling operator). Most existing methods in the literature attempt to optimize a randomly initiali...


A Random Sequence Generation Method for Random Demodulation Based Compressive Sampling System

Random demodulation based compressive sampling technique is a novel approach that it can break through the Shannon sampling theorem for the sparse signal capturing. A major challenge in the random demodulation based sampling system is the random sequence generation. In this paper, we introduce an approach to generate the high-speed random sequence that meets the incoherence of compressive sampl...


Recovery of Seismic Wavefields Based on Compressive Sensing by an l1-Norm Constrained Trust Region Method and the Piecewise Random Sub-sampling

SUMMARY Due to the influence of variations in landform, geophysical data acquisition is usually sub-sampled. Reconstruction of the seismic wavefield from sub-sampled data is an ill-posed inverse problem. Compressive sensing can be used to recover the original geophysical data from the sub-sampled data. In this paper, we consider the wavefield reconstruction problem as a compressive sensing and...



Publication date: 2006